Bayesian classification for data from the same unknown class

نویسندگان

  • Hung-Ju Huang
  • Chun-Nan Hsu
چکیده

In this paper, we address the problem of how to classify a set of query vectors that belong to the same unknown class. Sets of data known to be sampled from the same class are naturally available in many application domains, such as speaker recognition. We refer to these sets as homologous sets. We show how to take advantage of homologous sets in classification to obtain improved accuracy over classifying each query vector individually. Our method, called homologous naive Bayes (HNB), is based on the naive Bayes classifier, a simple algorithm shown to be effective in many application domains. RNB uses a modified classification procedure that classifies multiple instances as a single unit. Compared with a voting method and several other variants of naive Bayes classification, HNB significantly outperforms these methods in a variety of test data sets, even when the number of query vectors in the homologous sets is small. We also report a successful application of HNB to speaker recognition. Experimental results show that HNB can achieve classification accuracy comparable to the Gaussian mixture model (GMM), the most widely used speaker recognition approach, while using less time for both training and classification.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Classical and Bayesian Inference in Two Parameter Exponential Distribution with Randomly Censored Data

Abstract. This paper deals with the classical and Bayesian estimation for two parameter exponential distribution having scale and location parameters with randomly censored data. The censoring time is also assumed to follow a two parameter exponential distribution with different scale but same location parameter. The main stress is on the location parameter in this paper. This parameter has not...

متن کامل

ارتقای کیفیت دسته‌بندی متون با استفاده از کمیته‌ دسته‌بند دو سطحی

Nowadays, the automated text classification has witnessed special importance due to the increasing availability of documents in digital form and ensuing need to organize them. Although this problem is in the Information Retrieval (IR) field, the dominant approach is based on machine learning techniques. Approaches based on classifier committees have shown a better performance than the others. I...

متن کامل

دربارۀ شناسایی بیزیِ دنباله‌ای نقطۀ تغییر

The problems of sequential change-point have several important applications in quality control, signal processing, and failure detection in industry and finance and signal detection. We discuss a Bayesian approach in the context of statistical process control: at an unknown time  τ, the process behavior changes and the distribution of the data changes from p0 to p1. Two cases are consi...

متن کامل

A Validation Test Naive Bayesian Classification Algorithm and Probit Regression as Prediction Models for Managerial Overconfidence in Iran's Capital Market

Corporate directors are influenced by overconfidence, which is one of the personality traits of individuals; it may take irrational decisions that will have a significant impact on the company's performance in the long run. The purpose of this paper is to validate and compare the Naive Bayesian Classification algorithm and probit regression in the prediction of Management's overconfident at pre...

متن کامل

‎A Bayesian mixture model‎ for classification of certain and uncertain data

‎There are different types of classification methods for classifying the certain data‎. ‎All the time the value of the variables is not certain and they may belong to the interval that is called uncertain data‎. ‎In recent years‎, ‎by assuming the distribution of the uncertain data is normal‎, ‎there are several estimation for the mean and variance of this distribution‎. ‎In this paper‎, ‎we co...

متن کامل

روشی جدید برای عضویت‌دهی به داده‌ها و شناسایی نوفه و داده‌های پرت با استفاده از ماشین بردار پشتیبان فازی

Support Vector Machine (SVM) is one of the important classification techniques, has been recently attracted by many of the researchers. However, there are some limitations for this approach. Determining the hyperplane that distinguishes classes with the maximum margin and calculating the position of each point (train data) in SVM linear classifier can be interpreted as computing a data membersh...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics : a publication of the IEEE Systems, Man, and Cybernetics Society

دوره 32 2  شماره 

صفحات  -

تاریخ انتشار 2002